fix(cli): add missing subprocess.run() timeouts in doctor and status by dieutx · Pull Request #3693 · NousResearch/hermes-agent

dieutx · 2026-03-29T09:31:17Z

Summary

4 subprocess.run() calls in CLI utilities have no timeout parameter and can hang indefinitely if the child process blocks. The rest of the CLI already uses timeouts (clipboard.py: 3-15s, banner.py: 5-10s, doctor.py npm audit: 30s) — these 4 are the only ones missing.

Same class of bug as #3469 (git/ripgrep subprocess calls hanging on large repos).

Root Cause

subprocess.run() defaults to timeout=None, meaning it waits forever. If the docker daemon is unresponsive, systemctl is waiting for D-Bus, or an SSH host is unreachable past its ConnectTimeout, hermes doctor and hermes status hang with no feedback.

Fix

hermes_cli/doctor.py:409 — docker info: add timeout=10
hermes_cli/doctor.py:429 — ssh ... echo ok: add timeout=15 (slightly above SSH's own ConnectTimeout=5)
hermes_cli/status.py:285 — systemctl --user is-active: add timeout=5
hermes_cli/status.py:296 — launchctl list: add timeout=5

Each call site catches subprocess.TimeoutExpired and treats it as failure, consistent with how non-zero return codes are already handled.

Tests

1 new AST-based regression test in tests/hermes_cli/test_subprocess_timeouts.py — parameterized across all 4 CLI modules that use subprocess.run(). Parses each file and asserts every call has a timeout keyword. Catches future regressions automatically.

11 pre-existing doctor/status tests + 4 new = 15 passed.

Add timeout parameters to 4 subprocess.run() calls that could hang indefinitely if the child process blocks (e.g., unresponsive docker daemon, systemctl waiting for D-Bus): - doctor.py: docker info (timeout=10), ssh check (timeout=15) - status.py: systemctl is-active (timeout=5), launchctl list (timeout=5) Each call site now catches subprocess.TimeoutExpired and treats it as a failure, consistent with how non-zero return codes are already handled. Add AST-based regression test that verifies every subprocess.run() call in CLI modules specifies a timeout keyword argument.

All subprocess.run() calls in hermes_cli/gateway.py lacked a timeout parameter. If systemctl, launchctl, loginctl, wmic, or ps blocks (e.g. D-Bus unavailable, WMI service stuck, launchd unresponsive), hermes gateway start/stop/restart/status/install/uninstall hangs indefinitely with no feedback to the user. Timeout values applied: - Lifecycle commands (start/stop/restart/enable/disable/daemon-reload, launchctl load/unload): timeout=30 - Status/query commands (is-active, loginctl show-user, launchctl list, systemctl status, journalctl, tail, ps aux, wmic): timeout=5-10 - loginctl enable-linger: timeout=10 For _is_service_running() and launchd_status(), TimeoutExpired is caught explicitly and treated as not-running, matching how non-zero return codes are already handled. All other call sites are either inside existing try/except Exception blocks (find_gateway_pids, _enable_systemd_linger, get_systemd_linger_status) or raise TimeoutExpired as a clear error instead of hanging forever. Same class of fix as NousResearch#3469 (context_references) and NousResearch#3693 (doctor/status).

teknium1 · 2026-03-30T18:17:25Z

Merged via #4009. Cherry-picked with authorship preserved, all 32 tests pass. The AST regression test is a nice touch — catches future missing timeouts automatically. Thanks @dieutx!

dlkakbs mentioned this pull request Mar 29, 2026

fix(cli): add missing subprocess.run() timeouts in gateway #3732

Open

teknium1 mentioned this pull request Mar 30, 2026

fix(cli): add missing subprocess.run() timeouts in doctor and status #4009

Merged

teknium1 closed this Mar 30, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(cli): add missing subprocess.run() timeouts in doctor and status#3693

fix(cli): add missing subprocess.run() timeouts in doctor and status#3693
dieutx wants to merge 1 commit intoNousResearch:mainfrom
dieutx:fix/cli-subprocess-timeouts

dieutx commented Mar 29, 2026

Uh oh!

teknium1 commented Mar 30, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

dieutx commented Mar 29, 2026

Summary

Root Cause

Fix

Tests

Uh oh!

teknium1 commented Mar 30, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants